Document Image Compression and Analysis
Abstract
Dissertation, Doctor of Philosophy, 1997. Department of Mathematics.
Image compression usually considers the minimization of storage space as its main objective. It is desirable, however, to code images so that the resulting representation can be processed directly. In this thesis we explore an approach to document image compression that is efficient in both space (storage requirement) and time (processing flexibility). A representation is presented in which component-level redundancy is removed by forming a prototype library and a component location table. This representation forms a basis for compression and provides direct access to image components. To generate the prototype library, a new clustering approach is developed which is suitable for document image components. The distance metric is based on a character degradation model so that degraded versions of the same character will be grouped together. To achieve a lossless representation when required, the residuals are encoded efficiently using a structural distance ordering. OCR is then used as a measure of readability to evaluate the rate-distortion tradeoff for lossy compression. A set of algorithms is presented for typical document processing applications which operate effectively on the compressed representation. Applications demonstrated include subdocument retrieval, skew estimation, keyword search and document image matching. Extensions of the paradigm to grayscale and graphic document images, networking and multimedia objects are discussed.
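As a rough illustration of the prototype library and component location table described in the abstract, the following Python sketch groups equally sized component bitmaps, stores one prototype per group, and records each component's position, prototype index and residual. The function names, the tuple-based bitmap format and the plain Hamming-distance threshold are assumptions made for this example only. The dissertation's distance metric is based on a character degradation model, and its residuals are coded with a structural distance ordering; neither is reproduced here.

```python
from typing import List, Tuple

Bitmap = Tuple[Tuple[int, ...], ...]          # rows of 0/1 pixels (assumed format)
Component = Tuple[int, int, Bitmap]           # (x, y, bitmap)
Entry = Tuple[int, int, int, Bitmap]          # (x, y, prototype index, residual)


def same_shape(a: Bitmap, b: Bitmap) -> bool:
    """True when both bitmaps have identical row and column counts."""
    return len(a) == len(b) and all(len(ra) == len(rb) for ra, rb in zip(a, b))


def hamming(a: Bitmap, b: Bitmap) -> int:
    """Count differing pixels. A stand-in for the degradation-model distance."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def xor(a: Bitmap, b: Bitmap) -> Bitmap:
    """Pixel-wise XOR, used here as the residual between two bitmaps."""
    return tuple(tuple(pa ^ pb for pa, pb in zip(ra, rb)) for ra, rb in zip(a, b))


def encode(components: List[Component], max_dist: int) -> Tuple[List[Bitmap], List[Entry]]:
    """Build a prototype library and a component location table."""
    library: List[Bitmap] = []
    table: List[Entry] = []
    for x, y, bmp in components:
        match = None
        # Greedy assignment: reuse the first prototype that is close enough.
        for idx, proto in enumerate(library):
            if same_shape(proto, bmp) and hamming(proto, bmp) <= max_dist:
                match = idx
                break
        if match is None:
            library.append(bmp)               # first of its kind becomes a prototype
            match = len(library) - 1
        table.append((x, y, match, xor(library[match], bmp)))
    return library, table


def decode(library: List[Bitmap], table: List[Entry], lossless: bool = True) -> List[Component]:
    """Rebuild components; without residuals each one is replaced by its prototype."""
    return [
        (x, y, xor(library[idx], residual) if lossless else library[idx])
        for x, y, idx, residual in table
    ]
```

Because the location table keeps positions and prototype indices in the clear, operations such as counting or locating occurrences of a given prototype can, in this sketch, be done by scanning the table without rebuilding any pixels.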
Similar resources
Document Analysis And Classification Based On Passing Window
In this paper we present a document analysis and classification system to segment and classify the contents of Arabic document images. This system includes preprocessing, document segmentation, feature extraction and document classification. A document image is enhanced in preprocessing by removing noise, binarization, and detecting and correcting image skew. In document segmentation, an algorith...
Persian Printed Document Analysis and Page Segmentation
This paper presents a hybrid method, combining low-resolution and high-resolution analysis, for Persian page segmentation. In the low-resolution page segmentation, a pyramidal image structure is constructed for multiscale analysis and segments the document image into a set of regions. In the high-resolution page segmentation, connected component analysis segments each region into homogeneous regions and identifyi...
Fig. 5 High Resolution X-sar Image
(Hong Kong) as a test picture for the compression algorithm. Left: standard quicklook. Center: full-resolution image. Right: proposed quasi-full-resolution quicklook (image epitome), compression factor C = 680, the same as the standard quicklook. Fig. 6 A query for fields using a Gibbs model of order 5 and a relatively high degree of similarity (0.2) resulted in finding 4 images from the existing 484 o...
Voxel-based mapping of lesion-behavior relationships
The purpose of this document is to provide a practical overview of best practices for voxel-based analysis of the relationship between brain lesion data and behavior, an approach which Bates et al. (2003) have labelled “VLSM” for voxel-based lesion-symptom mapping. Although we don't always think of the behavioral measures as symptoms per se, the name “VLSM” is catchy and established, so I'll use...
Implementation of VLSI Based Image Compression Approach on Reconfigurable Computing System - A Survey
Image data require huge amounts of disk space and large bandwidths for transmission. Hence, image compression is necessary to reduce the amount of data required to represent a digital image. Therefore, an efficient technique for image compression is in high demand. Although many compression techniques are available, the technique which is faster, memory efficient and simple, surely...
Publication date: 1997